2 research outputs found

    BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

    This report is a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations on the historical, social and practical value of spam, and proposals for other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, methods that have been suggested for preventing the introduction of spam into a blog, and research on spam, focusing on work in the weblog context. It concludes with a proposal for a spam detection workflow that might form the basis of the spam detection component of the BlogForever software.

    Search and personalization services in a distributed network of cooperating High Energy Physics digital libraries: upgrading the CERN Document Server software

    CERN, the European Organization for Nuclear Research, has been involved for years in the open dissemination of scientific research results. To this end, the CDS Invenio software is being developed and used as a complete digital library system for the storage, preservation, management and distribution of CERN’s scientific work. The software is constantly refined and improved to meet the bibliographic needs of researchers and scientists alike. One of its current objectives is collaboration among digital libraries maintained by organizations with similar research interests. Within such a collaboration, searching and personalization services on a distributed network of cooperative digital libraries are being planned. In this thesis, we first examined in depth the basic principles and features of CDS Invenio. On this basis, we proposed and implemented a design that allows for searching and personalization services on a distributed network of cooperative digital libraries. We described thoroughly all the adjustments and additions needed to achieve this goal. Hosted collections, which are record collections belonging to the other members of the distributed network, are the core of our design. A range of classes and functions handles the searching of these collections by fetching the data over the Hypertext Transfer Protocol (HTTP) and then analyzing it. The personalization services, such as personal record collections and personalized notifications about newly added records, are implemented in a similar way for hosted collections. In the course of our work we addressed the uncertain availability of network resources and the associated network delays while maintaining an acceptable response time. We also dealt with the inherent drawback of distributed searching, namely the lack of control over the different indexing and ranking capabilities of each member of the distributed network. We concluded that the proposed architecture is an integrated solution for searching and personalization services on a distributed network of cooperative digital libraries, offering quality commensurate with that of an autonomous digital library. It is also open to extensions and improvements, which may be the subject of future work.